NSF PAR Search | NSF Public Access Repository

De Finetti’s theorem and related results for infinite weighted exchangeable sequences

https://doi.org/10.3150/23-BEJ1704

Barber, Rina Foygel; Candès, Emmanuel J; Ramdas, Aaditya; Tibshirani, Ryan J (November 2024, Bernoulli)

Full Text Available
Cross-prediction-powered inference

https://doi.org/10.1073/pnas.2322083121

Zrnic, Tijana; Candès, Emmanuel J (April 2024, Proceedings of the National Academy of Sciences)

While reliable data-driven decision-making hinges on high-quality labeled data, the acquisition of quality labels often involves laborious human annotations or slow and expensive scientific measurements. Machine learning is becoming an appealing alternative as sophisticated predictive techniques are being used to quickly and cheaply produce large amounts of predicted labels; e.g., predicted protein structures are used to supplement experimentally derived structures, predictions of socioeconomic indicators from satellite imagery are used to supplement accurate survey data, and so on. Since predictions are imperfect and potentially biased, this practice brings into question the validity of downstream inferences. We introduce cross-prediction: a method for valid inference powered by machine learning. With a small labeled dataset and a large unlabeled dataset, cross-prediction imputes the missing labels via machine learning and applies a form of debiasing to remedy the prediction inaccuracies. The resulting inferences achieve the desired error probability and are more powerful than those that only leverage the labeled data. Closely related is the recent proposal of prediction-powered inference [A. N. Angelopoulos, S. Bates, C. Fannjiang, M. I. Jordan, T. Zrnic, Science 382, 669–674 (2023)], which assumes that a good pretrained model is already available. We show that cross-prediction is consistently more powerful than an adaptation of prediction-powered inference in which a fraction of the labeled data is split off and used to train the model. Finally, we observe that cross-prediction gives more stable conclusions than its competitors; its confidence intervals (CIs) typically have significantly lower variability.
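
The impute-then-debias idea in this abstract can be illustrated for the simplest target, a population mean. The following is only a rough sketch under stated assumptions, not the authors' implementation: the least-squares model, the fold count, and the names `fit_linear`, `predict_linear`, and `cross_prediction_mean` are all hypothetical, and the confidence-interval construction is not shown. Each model is trained on all labeled folds but one, the unlabeled points are imputed with the average of the fold models, and the held-out labeled residuals supply the bias correction.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept (a stand-in for any ML model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predictions from the coefficients returned by fit_linear."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def cross_prediction_mean(X_lab, y_lab, X_unlab, n_folds=5, seed=0):
    """Point estimate of E[Y] from labeled and unlabeled data (sketch).

    X_lab, X_unlab: 2-D arrays of features; y_lab: 1-D array of labels.
    """
    rng = np.random.default_rng(seed)
    n = len(y_lab)
    folds = np.array_split(rng.permutation(n), n_folds)

    imputed = np.zeros(len(X_unlab))  # average prediction per unlabeled point
    residual_sum = 0.0                # held-out residuals for debiasing

    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        coef = fit_linear(X_lab[train], y_lab[train])
        # Impute unlabeled outcomes with the model trained without this fold.
        imputed += predict_linear(coef, X_unlab) / n_folds
        # Residuals on the held-out fold estimate the model's bias.
        residual_sum += np.sum(
            y_lab[held_out] - predict_linear(coef, X_lab[held_out])
        )

    return float(imputed.mean() + residual_sum / n)
```

When the model predicts perfectly, the correction term vanishes and the estimate reduces to the mean of the imputed labels; when the model is poor, the held-out residuals pull the estimate back toward the labeled-data mean.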
Full Text Available
Learn then test: Calibrating predictive algorithms to achieve risk control

https://doi.org/10.1214/24-AOAS1998

Angelopoulos, Anastasios N; Bates, Stephen; Candès, Emmanuel J; Jordan, Michael I; Lei, Lihua (June 2025, The Annals of Applied Statistics)

Free, publicly-accessible full text available June 1, 2026
Sensitivity analysis of individual treatment effects: A robust conformal inference approach

https://doi.org/10.1073/pnas.2214889120

Jin, Ying; Ren, Zhimei; Candès, Emmanuel J. (February 2023, Proceedings of the National Academy of Sciences)

We propose a model-free framework for sensitivity analysis of individual treatment effects (ITEs), building upon ideas from conformal inference. For any unit, our procedure reports the Γ-value, a number which quantifies the minimum strength of confounding needed to explain away the evidence for ITE. Our approach rests on the reliable predictive inference of counterfactuals and ITEs in situations where the training data are confounded. Under the marginal sensitivity model of [Z. Tan, J. Am. Stat. Assoc. 101, 1619-1637 (2006)], we characterize the shift between the distribution of the observations and that of the counterfactuals. We first develop a general method for predictive inference of test samples from a shifted distribution; we then leverage this to construct covariate-dependent prediction sets for counterfactuals. No matter the value of the shift, these prediction sets (resp. approximately) achieve marginal coverage if the propensity score is known exactly (resp. estimated). We describe a distinct procedure also attaining coverage, however, conditional on the training data. In the latter case, we prove a sharpness result showing that for certain classes of prediction problems, the prediction intervals cannot possibly be tightened. We verify the validity and performance of the methods via simulation studies and apply them to analyze real datasets.
Full Text Available
Conformal prediction beyond exchangeability

https://doi.org/10.1214/23-AOS2276

Barber, Rina Foygel; Candès, Emmanuel J.; Ramdas, Aaditya; Tibshirani, Ryan J. (April 2023, The Annals of Statistics)

Full Text Available
The asymptotic distribution of the MLE in high-dimensional logistic models: Arbitrary covariance

https://doi.org/10.3150/21-BEJ1401

Zhao, Qian; Sur, Pragya; Candès, Emmanuel J (August 2022, Bernoulli)

Full Text Available
Conformal Inference of Counterfactuals and Individual Treatment Effects

https://doi.org/10.1111/rssb.12445

Lei, Lihua; Candès, Emmanuel J. (October 2021, Journal of the Royal Statistical Society Series B: Statistical Methodology)

Evaluating treatment effect heterogeneity widely informs treatment decision making. At the moment, much emphasis is placed on the estimation of the conditional average treatment effect via flexible machine learning algorithms. While these methods enjoy some theoretical appeal in terms of consistency and convergence rates, they generally perform poorly in terms of uncertainty quantification. This is troubling since assessing risk is crucial for reliable decision-making in sensitive and uncertain environments. In this work, we propose a conformal inference-based approach that can produce reliable interval estimates for counterfactuals and individual treatment effects under the potential outcome framework. For completely randomized or stratified randomized experiments with perfect compliance, the intervals have guaranteed average coverage in finite samples regardless of the unknown data generating mechanism. For randomized experiments with ignorable compliance and general observational studies obeying the strong ignorability assumption, the intervals satisfy a doubly robust property which states the following: the average coverage is approximately controlled if either the propensity score or the conditional quantiles of potential outcomes can be estimated accurately. Numerical studies on both synthetic and real data sets empirically demonstrate that existing methods suffer from a significant coverage deficit even in simple models. In contrast, our methods achieve the desired coverage with reasonably short intervals.
Predictive inference with the jackknife+

https://doi.org/10.1214/20-AOS1965

Barber, Rina Foygel; Candès, Emmanuel J.; Ramdas, Aaditya; Tibshirani, Ryan J. (February 2021, The Annals of Statistics)
Full Text Available
The limits of distribution-free conditional predictive inference

https://doi.org/10.1093/imaiai/iaaa017

Foygel Barber, Rina; Candès, Emmanuel J; Ramdas, Aaditya; Tibshirani, Ryan J (August 2020, Information and Inference: A Journal of the IMA)
We consider the problem of distribution-free predictive inference, with the goal of producing predictive coverage guarantees that hold conditionally rather than marginally. Existing methods such as conformal prediction offer marginal coverage guarantees, where predictive coverage holds on average over all possible test points, but this is not sufficient for many practical applications where we would like to know that our predictions are valid for a given individual, not merely on average over a population. On the other hand, exact conditional inference guarantees are known to be impossible without imposing assumptions on the underlying distribution. In this work, we aim to explore the space in between these two and examine what types of relaxations of the conditional coverage property would alleviate some of the practical concerns with marginal coverage guarantees while still being possible to achieve in a distribution-free setting.
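
For readers unfamiliar with the marginal guarantee mentioned in this abstract, a sketch of one standard variant, split conformal prediction with absolute-residual scores, may help. This is a textbook construction and not the relaxed conditional method the paper studies; the function names and the choice of score are assumptions made for illustration. Given residuals from a held-out calibration set, the interval half-width is the ⌈(n+1)(1−α)⌉-th smallest absolute residual, which yields coverage of at least 1−α on average over exchangeable test points, with no guarantee for any particular individual.

```python
import math

import numpy as np


def split_conformal_halfwidth(cal_residuals, alpha=0.1):
    """Half-width of a split conformal interval (absolute-residual score).

    cal_residuals: y - prediction on a calibration set that was not used
    to fit the model.
    """
    scores = np.sort(np.abs(np.asarray(cal_residuals, dtype=float)))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > n:
        # Too few calibration points for this level: the interval is trivial.
        return math.inf
    return float(scores[k - 1])


def split_conformal_interval(prediction, cal_residuals, alpha=0.1):
    """Symmetric interval around a point prediction for a new test point."""
    q = split_conformal_halfwidth(cal_residuals, alpha)
    return (prediction - q, prediction + q)
```

The same half-width is used for every test point, which is exactly why the guarantee holds only on average: regions where the model is less accurate are covered less often than regions where it is more accurate.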
Full Text Available
Robust inference with knockoffs

https://doi.org/10.1214/19-AOS1852

Barber, Rina Foygel; Candès, Emmanuel J.; Samworth, Richard J. (June 2020, The Annals of Statistics)
Full Text Available